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Abstract 

Background: We propose a mathematical model for multichannel assessment of the 
trial-to-trial variability of auditory evoked brain responses in magnetoencephalography 
(MEG). 

Methods: Following the work of de Munck etal., our approach is based on the 
maximum likelihood estimation and involves an approximation of the spatio-temporal 
covariance of the contaminating background noise by means of the Kronecker product 
of its spatial and temporal covariance matrices. Extending the work of de Munck etal., 
where the trial-to-trial variability of the responses was considered identical to all 
channels, we evaluate it for each individual channel. 

Results: Simulations with two equivalent current dipoles (ECDs) with different 
trial-to-trial variability, one seeded in each of the auditory cortices, were used to study 
the applicability of the proposed methodology on the sensor level and revealed spatial 
selectivity of the trial-to-trial estimates. In addition, we simulated a scenario with 
neighboring ECDs, to show limitations of the method. We also present an illustrative 
example of the application of this methodology to real MEG data taken from an 
auditory experimental paradigm, where we found hemispheric lateralization of the 
habituation effect to multiple stimulus presentation. 

Conclusions: The proposed algorithm is capable of reconstructing lateralization 
effects of the trial-to-trial variability of evoked responses, i.e. when an ECD of only one 
hemisphere habituates, whereas the activity of the other hemisphere is not subject to 
habituation. Hence, it may be a useful tool in paradigms that assume lateralization 
effects, like, e.g., those involving language processing. 

Keywords: Evoked responses, Habituation, Lateralization, Kronecker product, 
Maximum-likelihood estimation, MEG, Noise covariance, Trial-to-trial variability 



Background 

Measurements of evoked brain responses in humans are an established standard in 
magneto- (MEG) and electroencephalography (EEG). The measured quantity is either 
the magnetic field above the surface of the head in MEG [1] or the electric field on the 
surface in EEG [2]. The sources of activity that underlie the measured signal are elec- 
tric currents generated in a certain area of the cortex. Unfortunately, the background 
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noise present in MEG and EEG measurements significantly exceeds the amplitude of 
the evoked brain activity of interest. The model commonly used to describe these data 
is the so-called signal-plus-noise (SPN) model. It assumes that: 1) the brain response 
to identical stimuli does not change from trial to trial, i.e. from one stimulation to 
another, and 2) the noise contaminating the response has a Gaussian distribution with 
zero mean. Hence, simple averaging of the sequentially recorded responses with respect 
to stimulus onset is supposed to yield a good estimate of the "real" brain response to the 
stimulus. 

In the framework of the SPN model, the signal R^f* measured on the /-th channel, at the 
y-th time moment, in the /c-th trial, can be described as 



lf=R il + S\ (1) 



where i = 1, j = 1, ...,/, k = 1, . . . ,K. Rij is the brain response to the stimulus, 
which is assumed to be constant from trial to trial, and s is a Gaussian noise with zero 
expected value, whose amplitude is typically larger than the amplitude of the evoked brain 
response. Due to the assumption of the Gaussian, zero-mean nature of the noise, s is 
supposed to vanish on averaging across a large number of R^s. 

Averaging across trials is useful when one is interested in the estimation of the evoked 
response, but in this approach information on the response variability across single tri- 
als is lost. In fact, the trial- to- trial variability is well documented by physiological studies. 
The analysis of single evoked responses has been addressed in various ways. For example, 
Bartnik et al. extracted and parametrized single evoked potentials in EEG by means of the 
wavelet transform [3]. Effern et al. [4], Quian Quiroga & van Luijtelaar [5], and Bagshaw & 
Warbrick [6] tried to increase the signal-to-noise ratio of single trials by wavelet denois- 
ing. Various statistical approaches— e.g., [7-9]— including new dedicated tests [10] were 
also proposed. De Munck et al. used a maximum-likelihood estimation [11], in which 
the spatial and temporal covariance of the contaminating noise was employed. Several 
coherence aspects were studied, for example, in [12-14]. Westerkamp & Aunon proposed 
a linear multielectrode filter to extract the single-trial responses [15], while Georgiadis 
et al. used the Kalman recursive filtering to obtain an estimate of a single-trial response by 
employing the information from the preceding trials [16]. Matching pursuit in the eval- 
uation of the trial-to-trial variability of the auditory responses in MEG was used, e.g., 
in [17-20]. 

This paper aims at improving the quality of the estimators of auditory evoked brain 
responses by taking into account their trial-to-trial variability on the multichannel level. 
The current work is an extension of the work of de Munck et al. [11], where the spatio- 
temporal response pattern of each trial was weighted by a scalar coefficient assumed 
identical for all MEG channels. We propose to enhance that approach by estimating a 
weighting coefficient for each individual channel. This enables distinguishing between, 
e.g., habituation trends which differ for distinct groups of channels thus reflecting differ- 
ent trial-to-trial variabilities of the underlying sources— something which is not possible 
in the approach of de Munck et al. [11]. We apply the proposed method to simulated data 
as well as an illustrative real MEG data set from an auditory experiment, and discuss its 
limitations in the Discussion. 
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Methods 

Assumptions 

Based upon the realistic assumption that the trial-to-trial variation of evoked brain 
response should be taken into account when estimating the response amplitude, the 
following extension of the SPN model was proposed by de Munck et al. [11]: 



R (k) 



(*) 



(2) 



where, for each trial /<, is a coefficient scaling the spatio-temporal amplitude pattern 
of the response, R, in this trial In that work, the spatial and temporal covariance of the 
contaminating noise was estimated in order to minimize its influence on the estimates of 
a and R. 

In this work, we propose to replace the scalar with a diagonal matrix xj/^ in order 
to trace the trial-to-trial amplitude variation for individual channels i = 1, . . . , /: 



r / (*) 



(*) 



0 



0 
0 



0 0 



/ (k) 



(3) 



Therefore, the order of the matrix equals the number of channels /. In this way, in 
every single trial /<, each of the channels i is assigned to the scalar coefficient correspond- 
ing to the evoked response amplitude variations on this channel Thus, the fundamental 
equation of the considered process, (2), is replaced by 



Additionally, the following assumption is made 



(4) 



K 0 ... 0 
0 K . . . 0 

0 0 ... K 



(5) 



which, since K is the number of trials, says that in the case of non-variability of the 
response, each of the components on the leading diagonal of \jf^ is equal to one. 

The model (4) poses further difficulties, because the dimensionality of the estimation 
process increases, yet we hope it offers a more accurate reflection of reality. Since it is 
known that the brain response evoked by a characteristic sensory stimulus arises in a par- 
ticular area of the cortex (see, e.g., [21-24]), the magnetic fields measured by the channels 
corresponding to this area a are stronger than those acquired from less active parts of the 
brain. Thus, it is reasonable to distinguish between individual channels when estimating 
the variability of the evoked response. For example, in their study of the habituation of the 
P300 wave for visual stimuli, Ravden & Polich observed that habituation was strongest at 
the Fz and Cz electrode sites [25]. 

Following the approach of de Munck et al. [26], the following additional assumptions 
concerning the noise space-time correlation are made: 

• the spatio-temporal covariance matrix, C, of the noise is the Kronecker product (see, 
e.g., [27]), of the spatial, X, and the temporal, T, covariance matrices, 



C = X®T, 



(6) 
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i.e., for Xn> denoting the scalar being the spatial covariance between channel i and 
channel i f , and Tjf being the temporal covariance between sample ; and sample /, 
C is of the following form 



X\\T\\ . . . X\\Tij 
X\\Tji . . . X\\Tjj 




XuTii . . . XuT\j 
XuTji . . . XuTjj 








X/iTn . . . Xi\T\j 
Xn Tj\ ... Xji Tjj 




X n Tn . . . X n T\j 
XjjTji . . . XjjTjj _ 



(7) 



• the noise of different trials is uncorrelated. 

Although the dimensions of C = X <S> T are not smaller than the dimensions of the 
spatio-temporal covariance of the noise built traditionally, assumption (6) permits split- 
ting C into X and T. Since the noise covariance matrix has to be estimated and inverted, 
this splitting is a clear advantage thanks to the fact that (X (g) T)~ l = X~ l (g) T~ l . 

It is clear from (6) that a common pattern of both X and T across trials k is assumed, 
as well as that X is time invariant and that T is space invariant, which of course is a 
simplification (see [28]), yet, a computationally efficient one. 



Estimation of evoked response varying across trials 

To derive the maximum-likelihood estimators of X, T, xj/^ and R, one needs to find the 
distribution of Due to the assumption of the zero-mean, Gaussian nature of the noise 
and the aforementioned assumptions concerning the noise correlation, the covariance 
between two noise realizations (one on channel /, at sample in trial /<, and the other on 
channel i' , at sample /, in trial k') equals [26] 

E(# ) 4/ ) ) =Xi*lj/8 kk >, (8) 
where 5^' is the Kronecker delta defined as 
f 1 for k = k 

hk> = \ . (9) 

[ 0 for k jt k . 

Thus, the determinant of the overall covariance equals [27] 

det (E (efefj?)) = (detXY K (det T) IK . (10) 
Moreover, since 

E « } - ^) (r% (t-%, »* - ff% f ) = 

(ii) 



k 

the probability density distribution f e of e is expressed by 



f s {R,f (k \X, T) =exp (-\T, tr ({ Rm ~ * (k)R ) T ~ l ( Rm ~ f^Rfx- 1 )^ ■ 



(12) 



(27r)-^ 2 (det T)-^ 2 (detXr rK/2 ■ 
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In the context of the maximum-likelihood estimation of X, T, x//^ and R, (12) represents 
the likelihood function. For the sake of keeping the formulas simple, all estimators are 
denoted as non-dashed (i.e. by X instead of X, for example). 

Calculation of the estimates of X, T, f and R 

Differentiating (12) with respect to X, T, x//^ and R, and setting the obtained derivatives 
to zero, one obtains the following estimators (see Appendix for detailed derivations). 



X = j- J2 (R (k) - f (k) R) T- 1 (*<*> - i, (k) R) T , (13) 

k 

T = 2k E ( R<k) ~ <A V 1 (R (k) - f (k) R)> d4) 

k 

= dg ^x-i 0 (RT-^y 1 diag (x-iRtoT- 1 *?)^ , (15) 
R=(j2 ^ ( * } *- V ( * } ) E f {k) X~ l * {k) > (16) 

V k Ik 



where O denotes the Hadamard product, i.e. (Y Q Z) mn = Y mn Z mn} diag denotes the 
operator resulting in a vector that contains elements of the leading diagonal of the matrix 
that diag acts on, and dg denotes the operator resulting in a diagonal matrix whose leading 
diagonal contains elements of the leading diagonal of the matrix (or elements of the vector, 
as in this case) that dg acts on and zeros elsewhere. 

The above system of matrix equations (13)-(16) has no closed form solution. Hence, in 
order to find the approximated estimates of X, T, if/ and R, we propose to solve it itera- 
tively, starting with X, T and xj/^ being identity matrices, and with R = K~ l ^ K R^ k \ In 
such an approach, consecutive updates of X, T, \\r and R are computed based on their val- 
ues from the previous iterations in which formulae (13)-(16) are computed consecutively. 
At the end of each iteration, each \\f{ is scaled according to (5). 

Material 

Real MEG data were acquired from a healthy male volunteer (age 42, normal hearing), 
who is a member of the Special Lab Non-invasive Brain Imaging. The subject was exposed 
binaurally to 500-ms long 1-kHz sine tones at 90 dB SPL. The difference between the 
onsets of the consecutive stimuli was 2 s. The data were sampled at the frequency of 
1017.25 Hz and filtered between 0.5 and 400 Hz. 160 trials were extracted out of 224 
as artifact free. They were offset corrected for each channel with respect to the mean 
magnetic field over the 200-ms long prestimulus baseline time window. 

Results 

Simulations 

In order to test to which extent the proposed methodology correctly assesses the 
channel-dependent trial-to-trial variability of the spatio-temporal response pattern R, we 
simulated two equivalent current dipoles (ECDs), one in each auditory cortex, whose time 
courses were habituating differently over trials. On the sensor level, the simulated signal 
can then be expressed as 

Rf =L l {i)h l {k)s l {j) +L 2 (i)h 2 (k)s 2 U) + 4\ (17) 
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where L\ (i) denotes the leadfield of source 1 to sensor r, h\ (k) represents the trial-to-trial 
variation of the signal s\(j) of source 1, and likewise L^ii), hiik), and 52(7). Of course, 
Li(i) h\(k) s\(j) + ^2(0 h2(k) 52(7) is not identical to tyf^Rij in (4), thus (4) is an approx- 
imation rather than the exact representation. Therefore, the method formally applies, i.e. 
it is able to perfectly disentangle h\ from hit only when the two leadfields are mutu- 
ally orthogonal, i.e. when L\{i)L2{i) = 0 for all i; then, the Hadamard product of the 
two related forward fields L\(i) h\(k) s\(j) and Liii) ^2(^)^2(7) is a zero field. However, 
since the two simulated sources were clearly separated in space, we assumed that this was 
practically the case. 

Based on a magnetic-resonance-imaging (MRI) scan of our subject (see Section 
Material), we selected one location per auditory cortex where the M100/N100 response 
(see, e.g., [29-31], and references therein) to simple 1-kHz tones is supposed to be gener- 
ated. Assuming the spherical head model, at each of these locations we seeded a tangential 
ECD of unit moment (figure 1), whose amplitude time course was simulated as the 
[x, y, z] -moment multiplied by a normalized time series generated as 51(7) = sin(3///) 
for the left hemisphere and 52(7) = sin(3/// + 7T/25) for the right hemisphere, to sim- 
ulate the M100 wave at its peak proximity and to account for a possible phase (or peak 
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Figure 1 Simulated ECD locations. Two tangential ECDs of unit length were seeded (at the back ends of 
the red arrows) in the auditory cortices. The corresponding forward field is shown in the bottom-right panel. 
The spherical head model was applied, and the green sphere was fitted using FieldTrip [32], based on 1 92 
surface points selected from head digitization. Only the points close to the two auditory cortices were used 
for the fit. The blue lines intersect at the anterior commissure. 
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latency) difference between the time courses of the two sources; see figure 2. In the sim- 
ulation, / = 51, which covered 50 ms at the sampling frequency of 1017.25 Hz. These 
trial-invariant time courses were next multiplied across trials by scalar coefficients orig- 
inating from two linear trends (one for each of the two ECDs) with negative slopes, to 
model the habituation of the two sources across trials. Namely, each of the two linear 
trends was a vector containing K weighting coefficients. Those vectors were simulated as 
h\(k) = —k + 2K for s\ and h2(k) = —k + 3K/2 for S2, and scaled according to (5) each, 
resulting, for K = 160, in the slopes: —0.0041 for the left and —0.0060 for the right source. 

The forward field resulting from the two sources was calculated for each trial k and 
contaminated by noise taken from the — 50-0-ms prestimulus baseline period of individ- 
ual trials of a real MEG measurement of the same subject, whose MRI was used to seed 
the simulated ECDs. Both the forward-field signal and the noise were normalized with 
respect to their norms across all channels r, samples j and trials /c, and then the noise was 
scaled such that the overall signal-to-noise (SNR) ratio equaled 0.1, a very demanding but 
realistic value, which we based on an estimate of SNR from a real data set. 

Figure 3 shows the estimates obtained from the iterative solution of equations 
(13)-(16) after 20 iterations. The top panel shows the estimated trial-invariant pattern 
of the response, R, at / = 25, i.e. between the two M100 peaks originating from the two 
ECDs. A strong correspondence to the forward field of the simulated ECDs (see figure 1) 
can be seen. Since the estimated if/, which scales R across trials (see (4)), is typically noisy, 
lst-degree polynomials were fitted (see [11,17,18]) to the xj/ estimates on each channel in 
order to reveal the main tendency in the overall trial-to-trial variability. Figure 3 (middle) 
shows the slopes of the fitted polynomials for each channel. We only show the values for 
those sensors at which the value of the slopes was significantly different from zero at the 
Bonferroni-corrected significance level of 0.05/245, where 245 is the number of artifact- 
free channels (we excluded 3 sensors from the 248-channel set of our 4D MEG system due 
to strong artifacts on those channels; let us remind here that we used a real-measurement 
noise in these simulations). Sensors marked with empty circles denote the slopes which 
were not significantly different from zero. Note that those are predominantly channels 
with poor SNR, as revealed by the top panel. The significant slopes reveal the difference 
in habituation between the two auditory cortices, with the mean equal to —0.0041 for 
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Figure 3 Results of simulations. Top: the trial-invariant spatial response pattern R at 7 = 25, i.e. for the time 
point at the middle of the analyzed time window (see also figure 2) in a 2D projection of the MEG sensors 
layout (open black circles), for the simulated data generated as the superposition of the habituating activity 
(see figure 2) of the ECDs depicted in figure 1 and the contaminating spatially and temporally correlated 
background noise, at the overall SNR = 0.1 . Middle: Spatial distribution of the slope coefficients of 1 -st degree 
polynomials fitted to the multichannel estimates of f reflecting the trial-to-trial variability of/?. Sensors 
marked with blue-filled circles denote the negative slopes significantly different from zero at the 0.05/245 
significance level, where 245 is the number of analyzed channels, revealing a clear habituation. The exact 
color of a circle reflects the value of the slope. Bottom: The estimate of the temporal covariance T of the noise. 
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the left-hemisphere sensors and —0.0056 for the right hemisphere, compared to —0.0041 
and —0.0060 at the source level. In both cases, the corresponding standard deviation o 
was equal to 0.0007. Hence, the habituation lateralization on the source level was well 
reconstructed on the sensor level despite the demanding overall SNR of 0.1. 

The bottom panel of figure 3 shows the estimate of the temporal covariance, T, of the 
contaminating noise, which reveals some nonstationarity [33,34]. We do not show the 
somewhat peculiar pattern of the estimate of X, since it stems from the particular spiral 
arrangement of the sensors labels in the 4D MEG system, which makes visual inspection 
of X impaired. 

For illustrative purposes, figure 4 shows, for a selected channel revealing a strong signal 
and a clear habituation, the estimated ifs^ (black dots) and the consecutively fitted lst- 
degree polynomial. 

In addition, we simulated a scenario in which only one of the two sources, namely 
the one in the right hemisphere, habituated. For that source, habituation strength was 
the same as before, whereas for the left-hemisphere EDC, h\(k) = 1 independent 
of /c. Figure 5 shows results of the estimation, in full analogy to those from figure 3. 
Note that the algorithm managed to reconstruct the habituation on the sensors that 
were located above the right hemisphere, while those for the left hemisphere remained 
essentially unaffected. This time the mean of the significantly non-zero slopes of the 1st- 
degree polynomials fitted to the estimates of ifr for the right-hemisphere sensors was 
equal to —0.0066 (with a = 0.0012), still close to the slope of the simulated /z2, i.e. 
to -0.0060. 

In order to test the performance of the method in situations in which the leadfields of 
the ECDs are far from mutual orthogonality, we simulated an additional scenario in which 
a third ECD was placed in the right hemisphere; see figures 6 and 7. The time course of 
this third ECD was simulated as 53(7) = sin(3;// + tt/ 15) normalized with respect to the 
peak value. Due to two ECDs in the right hemisphere in this scenario, the overall ECD 
signal strength in that hemisphere was larger, compared to that in the left hemisphere. 
Moreover, due to the aforementioned scaling of the signal and the noise resulting in an 
overall SNR of 0.1, the SNR for the left-hemisphere sensors was now weaker than in the 
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Figure 4 Habituation. 1 st-degree polynomial (blue line) fitted to xfr for a right-hemisphere channel 
revealing a strong signal and a clear habituation (see also figure 3). For this channel, the fitted line 
p(k) = -0.0058 k+ 1.2681. 
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Figure 5 Results of simulations. Like in figure 3 but for another simulation in which the left-hemisphere 
ECD did not habituate. Note the spatial selectivity of the habituation estimate on the sensor level (the middle 
figure). 



previous simulations. To examine the influence of the trial-to-trial characteristics of the 
third source on the overall habituation pattern for the right hemisphere, we assumed that 
this additional source did not habituate, i.e. that its h^(k) = 1 for all k (h\ and hi were like 
in the first simulation; see figure 2). Since, as aforementioned, ECDs whose leadfields are 
not distinct enough, i.e. not orthogonal or almost orthogonal, can not be efficiently sep- 
arated, the estimated overall habituation pattern for the right hemisphere is weakened in 
this scenario; see figure 8. This is due to the fact that the third source did not habituate. 
The observed pattern is a mixture of the trial-to-trial characteristics of the two sources 
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Figure 6 Simulated ECD locations. Like in figure 1 but with the additional (third) ECD in the right auditory 
cortex. 



in the right hemisphere; the mean of the significantly non-zero slopes of the lst-degree 
polynomials fitted to the estimates of \jr for sensors above the right-hemisphere was equal 
to —0.0036 (with a = 0.0006). This will of course be the case for any real-data sce- 
nario with multiple sources located close to each other. However, since the exact number 
of ECDs is not known in practice, we do not consider such an "integrative", and hence 
impaired, resolution of the method a serious flaw. The fact that we do not know, and hence 
do not want to assume explicitly, the exact number of ECDs for real data is essentially the 
reason for estimating the trial-to-trial variability on the sensor rather than source level. 

Since, as mentioned above, the SNR for the left-hemisphere sensors in this scenario 
was smaller than in the two previous simulations, the habituation pattern for the ECD 
of the left auditory cortex was now revealed less clearly. Compared to the result shown 
in figure 3, the habituation pattern of the left hemisphere is now weaker (although 
some sensors still reveal significant decay, with the mean significant slope —0.0049, 
cr = 0.0003). 



Real MEG data 

Figure 9 shows the findings for real MEG data (see Section Material) after 20 iterations 
(see Section Calculation of the estimates ...). The analyzed time window was 70-120 ms, 
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Figure 7 Simulated ECD time courses. As in figure 2 but for the three sources shown in figure 6, with the 
green color corresponding to the additional ECD in the right hemisphere. 



which covered the positive/negative part of the M100 waveform for the left/right channel 
with the best SNR. 

The top panel of figure 9 shows the estimated trial invariant response RdXj = 25 cor- 
responding to t = 95 ms. This pattern looks very close to the average signal at this time 
point. In the middle panel, a clear habituation pattern is visible for several sensors located 
above the right auditory cortex, as they reveal significantly negative slopes at the 0.05/245 
significance level. The mean of those significant slopes is —0.0041 (cr = 0.0005). In this 
subject, the trial-averaged signal reveals an overall worse SNR above the left, compared 
to the right hemisphere, which might be the reason for this particular lateralized result. 

The bottom panel shows the estimated temporal covariance matrix T of the noise, 
which reveals some tiny traces of nonstationarity. 

Figure 10 for the real-data analysis is fully analogous to figure 4 for the simulation. Alter- 
natively to the lst-degree polynomial, one might fit an exponential function. However, 
given the observed level of the inter-trial variance of \j/ and the fact that in the real-data 
scenario the exact characteristics of habituation is not known, we decided to follow the 
Ockhams razor principle and hence opt for linear fits. Besides, exponential fits resulted 
in exactly the same channels revealing significant habituation. 

Discussion 

We have introduced an extension of the maximum-likelihood approach for the estima- 
tion of the trial-to-trial variations of evoked brain responses in MEG based on the work 
of de Munck et al. [11] to obtain sensor-specific estimates. In [11], the trial-invariant 
spatio-temporal response pattern was weighted in each trial by a scalar coefficient which 
was identical for all sensors. Here, we further developed this approach by estimating a 
coefficient for each individual sensor. We checked its applicability using simulated data 
and found that the lateralization of the habituation simulated on the source level could 
be reconstructed on the sensor level, even for a very demanding SNR. The analysis of 
illustrative real MEG data revealed an auditory-cortex related habituation pattern, too, 
showing that the method is capable of finding significant habituation in data from real 
experiments. 
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Figure 8 Results of simulations. Like in figures 3 and 5 but for the scenario depicted in figures 6 and 7. 



The estimation of the channel-dependent x// is computationally much more demanding 
compared to the scenario with a assumed identical to all channels [11]. This is so because 
the dimensionality of the problem increases, while the probability space becomes reduced 
by one dimension — the number of channels. This makes the overall estimation difficult 
and possibly less robust compared to the case with the scalar a^ k \ Assuming that all the 
cortical processes were characterized by the same trial-to-trial variability pattern, the use 
of the simpler approach of de Munck et al. [11] would be more appropriate. However, 
it is not realistic to assume a priori that all processes involved will habituate identically, 
especially not in scenarios with complex stimulation paradigms. In fact, some may even 



Sieluzycki and Kordowski BioMedical Engineering OnLine 2014, 13:75 
http://www.biomedical-engineering-online.eom/content/13/1/75 



Page 14 of 19 



>0 ^o 0 o^a8B^oo-ooo°o 
oooo 0 o 0 ^H^o 0 o°oo J ^o ( 

. oooooooooooor 
O o^ no0 oooooo0j 

~ ^ _ s~\ /~\ /~\ /~\ /~\ r\d. 



oom 



nOOOOOOSi in 
y OoOoOc i " 

o ooo r 

QXOOO) 
00 ^oOr, 0o0 ^IF 

ooO- 0 o 0 o 0 o°oi 0 c 
o0 o o o o o o o° o 



O 0 °Oo^o8o§&oO° 0 O 
0°0 0 0n °°o88 8oO° 0 0 0°0 

0 00 oooooooooooooo»«° 0 

„ ooooooooooooo . 

0° 0 oOO°0 00 0 00 00n *Oo 




o^oooo^o 




o0 oWo° 0 



MOO 



I 

• - 

I 



0 01 




Figure 9 Results for the real MEG data. Top: the estimate of R [fT] at j = 25 corresponding to f = 95 ms. 
Middle: the slopes of 1 st-degree polynomials fitted to the estimates of f. Sensors marked with blue-filled 
circles reveal a clear habituation at the restrictive 0.05/245 significance level. Bottom: the estimate of the 
temporal covariance T of the noise. 



reveal the opposite behavior, i.e. sensitization (see, e.g., [35]). Even for simple auditory 
stimuli, the habituation pattern may reveal differences between the two hemispheres — 
something that could not be addressed with the model of de Munck et al. [11]. Hence, 
in such cases, topographical representations of the trial-to-trial variability will be more 
appropriate. 

An important issue to be taken into consideration when applying the advanced 
approach presented here is the transition from the source-level model to the sensor-level 
model. If all sources were characterized by exactly the same trial-to-trial variability, the 
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Figure 1 0 Habituation in real-data analysis. The blue line depicts the 1 st-degree polynomial 

p(k) = -0.0042 k + 1 .2079 fitted to \j/ for a right-hemisphere channel revealing a strong signal and a clear 

habituation (see figure 9). 



model with scalar cr^ [11] would be sufficient, as it corresponds to linear scaling of 
the forward equation in each trial Such simple scaling is, however, not possible if each 
source has its own trial-to-trial variation, as assumed in the simulations (see Section 
Simulations). Hence, our model, given by (4), is not exact. We are aware of this inaccu- 
racy, yet, for sources that are sufficiently separated in space, as it is the case when a single 
ECD per auditory cortex is assumed, this inaccuracy is rather small. In this case, one can 
treat the problem almost as separate forward problems, one for each source. Of course, 
the closer in space the sources, the more problematic this issue becomes, which is why the 
applicability of the proposed approach in scenarios with differently habituating sources 
in close proximity to each other would be questionable. Alternatively, one could propose 
a linear expansion into several a^R products with scalar a^s, where each term would 
correspond to the forward field of a single underlying source. However, this approach 
would require knowledge of the exact number of sources a priori, something which is 
usually not available. Therefore, we consider our model to be a reasonable approximation 
in paradigms where one can assume spatially distinct sources. 

We critically examined the convergence of the algorithm and found that the estimates 
of xj/ and R show some tiny oscillations with consecutive iterations, i.e. that the ratio of 
the norms of the estimates in two consecutive iterations does not decay monotonically. 
Nonetheless, in numerous simulations as well as for real data, we found that 20 iterations 
is a number which, however somewhat arbitrary, results in the spatial distribution of the 
estimate of R resembling that of the average signal very well and in \J/ nicely revealing the 
habituation patterns of the underlying sources. A good reason for not proceeding further 
is— apart from the related time cost— the fact that for real data the algorithm might result 
in patterns of R which would differ from the trial-averaged signal a bit too strongly. This 
may be caused by some information flow (leakage) between the estimates of \J/ and R. 
Hence, we infer that about 20 iterations offer a reasonable trade-off between the quality 
of the estimate of ^ and that of the R estimate. 

The moderate nonstationarity in the estimated temporal covariance of the noise, 
depicted in figures 3, 5, 8, and 9, may stem from the intrinsic properties of the noise. A 
recent work by Kipiriski et al. [34], which examined nonstationarity of MEG data using 
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modern statistical tests adopted from econometrics, revealed variance-nonstationarity 
of the analyzed signals. However, the nonstationarity of T that we observed may also 
stem from a non-perfect separation of the stochastic noise from the quasi-deterministic 
evoked response. These considerations correspond to the ongoing debate on the mech- 
anism of the generation of evoked responses and the associated relation of the phase of 
spontaneous rhythms with respect to the stimulus onset (see, e.g., [36,37], and references 
therein). 

Endnote 

a We do not discuss here the difference between magnetometers or axial and planar 
gradiometers in this respect. An interested reader is referred to the famous paper of 
Hamalainen et al [1]. 

Appendix: Detailed derivations 

Estimation of X and T 

Since the likelihood function f e attains maximum for the same arguments as its natu- 
ral logarithm, we can differentiate In (f e ). We will calculate the derivative of the natural 
logarithm of (12) with respect to X, and set it to zero. Thus, we will differentiate 

a \nf e a 



dx dx 



(\ E tr ((* ( * } - * ( * } *) T ~ l ( R(k) - v^) V 1 ) 

+ In ({2n) IJK ' 2 (det T) IK ' 2 (detX)^ . 



+ 



(18) 



Let us notice that for any symmetric, non-singular matrix X, it is true (see [27], p. 180) 
that 

^detX=(detX)X- 1 . (19) 
oX 

Furthermore, since here X is square, it is true (see [27], p. 178) that 

a 

dx' 

where 

A = (*<*> - f^R) T- 1 (*<*> - ir^R) T . (21) 

Therefore, using basic properties of logarithm and the fact that both X and T are 
symmetric, we can write 

31n/ e 



; tr (^X -1 ) = - (X^AX' 1 ) 7 , (20) 



= \ E ( Rik) - * (k)R ) r_1 ( R(k) - ^) T * _1 ) - y*" 1 • 



After setting the expression above to zero, we obtain (13). 
Performing very similar derivation, one obtains (14). 

Estimation of jr^ and R 

Using the facts (see Table 4 on p. 178 in [27]) that 

a 



(22) 



and 



tr(AM) = A T (23) 
dM 

a 

—tr(MAM T B) = B T MA T + BMA , (24) 
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it is straightforward to obtain 

-\ 4^ £ tr - * (k>R ) r_1 ( R(k> - V ') = 

= -dg (x-W^t-W) + dg (;r V ( * 0 *r- 1 * T ) , 

(25) 

where dg is the operator which replaces every square matrix M with a new one, consisting 
of the leading diagonal of M and having zeros elsewhere. Then, the following algebraic 
manipulations lead to equation (15). For the first term of the right-hand side (RHS) of 
equation (25) we can write 

V/ (dg (x-W> T~ 1 R r )) u = (diag(jr 1 *<*'> I*- 1 * 1 )) , (26) 

where i = 1, ( )# denotes the ith element of the leading diagonal of an / x / matrix, 
and ( )i denotes the ith element of an / x 1 column vector. 
For the second term of the RHS of equation (25) we can write 

Vi (dg (x- 1 ^RT- 1 R r y) u = Y, x u l ff )R tlT^(R T )mi • (27) 

t,l,m 

where t = 1, ...,/,/ = 1, ...,/, m = 1, ... ,/. 

Since the RHS of equation (25) must be set to zero and since it contains diagonal terms 
only, we have / equations, one for each i, in which the RHS of equation (26) is equal to the 
RHS of equation (27). 

Next, we can rewrite the RHS of equation (27) as 

E** 1 (e*^- 1 ^) J) = Eft^f } , (28) 

t \l,m ) t 

where /3 is an / x / matrix whose elements are given by 

Pit = Xu 1 E^^ 1 ^)™ = X u' ( RT ~ lRj ) ti ■ ( 29 ) 

l,m 

Now, since T~ l is symmetric, so is RT~ l Fp ', and hence we can write 

ft Y = x- 1 (flr- 1 * 1 ) = (x- 1 o (i?r- 1 i? T )) . (30) 

Let us then denote the RHS of equation (26) by y/, i.e. the ith element of a column vector 
y. Since the RHS of (28) is assumed to be equal to //, we can write 

J2PufP = Yi (3D 

t 

and therefore, in the matrix notation, 

/Miag(V (/0 ) = y. (32) 
Then, of course, 

diag(^ r) ) =/rV. (33) 
and hence 

^«=dg(rV), (34) 
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which, given the RHS of (30) for the elements of /3 and the RHS of (26) for the elements 
of y, yields equation (15). 
Very similarly, i.e. using (23) and (24), one easily obtains (16). 
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